Video captioning has been attracting broad research attention in the multimedia community. However, most existing approaches either ignore temporal information among video frames or employ only local contextual temporal knowledge. In this work, we propose a novel video captioning framework, termed \emph{Bidirectional Long-Short Term Memory} (BiLSTM), which deeply captures bidirectional global temporal structure in video. Specifically, we first devise a joint visual modelling approach to encode video data by combining a forward LSTM pass, a backward LSTM pass, and visual features from Convolutional Neural Networks (CNNs). Then, we inject the derived video representation into the subsequent language model for initialization. The benefits are twofold: 1) comprehensively preserving sequential and visual information; and 2) adaptively learning dense visual features and sparse semantic representations for videos and sentences, respectively. We verify the effectiveness of our proposed video captioning framework on a commonly-used benchmark, i.e., the Microsoft Video Description (MSVD) corpus, and the experimental results demonstrate the superiority of the proposed approach compared to several state-of-the-art methods.
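To make the encoder design concrete, the following is a minimal PyTorch sketch of a bidirectional LSTM video encoder that fuses a forward pass, a backward pass, and per-frame CNN features. This is not the authors' implementation; the module names, feature dimensions, and the concatenation-plus-projection fusion strategy are all assumptions made for illustration.

\begin{verbatim}
import torch
import torch.nn as nn

class BiLSTMVideoEncoder(nn.Module):
    """Sketch of a bidirectional LSTM video encoder that fuses
    forward and backward temporal passes with CNN frame features."""

    def __init__(self, cnn_feat_dim=2048, hidden_dim=512):
        super().__init__()
        # Forward and backward LSTMs over the CNN frame-feature sequence.
        self.forward_lstm = nn.LSTM(cnn_feat_dim, hidden_dim, batch_first=True)
        self.backward_lstm = nn.LSTM(cnn_feat_dim, hidden_dim, batch_first=True)
        # Project both passes plus mean-pooled CNN features into one
        # video representation (this fusion scheme is an assumption).
        self.fusion = nn.Linear(2 * hidden_dim + cnn_feat_dim, hidden_dim)

    def forward(self, frame_feats):
        # frame_feats: (batch, num_frames, cnn_feat_dim) from a pretrained CNN.
        fwd_out, _ = self.forward_lstm(frame_feats)
        # Reverse the frame order for the backward pass.
        bwd_out, _ = self.backward_lstm(torch.flip(frame_feats, dims=[1]))
        # Fuse the final hidden state of each pass with pooled visual features.
        fused = torch.cat(
            [fwd_out[:, -1], bwd_out[:, -1], frame_feats.mean(dim=1)], dim=-1
        )
        # The result can initialize the hidden state of the caption decoder.
        return torch.tanh(self.fusion(fused))

# Usage: encode 26 frames of 2048-d CNN features for a batch of 4 videos.
encoder = BiLSTMVideoEncoder()
video_repr = encoder(torch.randn(4, 26, 2048))
print(video_repr.shape)  # torch.Size([4, 512])
\end{verbatim}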